--- title: Oh The Places I've Been description: Trying to visualize where I have traveled in the United States. author: Evan Canfield date: '2019-04-29' slug: where-have-i-traveled categories: [] tags: [maps, R] draft: false ---
A few years ago, as I was planning a few trips, I was trying to figure out every place I had traveled, particularly in the United States. This started with putting together every state I had gone to, but eventually the question gnawed at for long enough that I put together a list of every county I’d crossed into. That was a bit harder, but I feel confident I was able to accurately recall 95% or more of the counties I’ve been in. It helps when family vacations were generally to the same places each year, and extended family all lived fairly close.
Over the last few months I have become more and more familiar with R and visual mapping I though it would be worth it to transform a list of states and counties visited and turn it into some visualizations. Plus this would be good practice for working with spatial data.
The following libraries were used.
#Library Calls
if (!require(tidyverse)) {install.packages('tidyverse')}
library(tidyverse)
if (!require(broom)) {install.packages('broom')}
library(broom)
if (!require(tigris)) {install.packages('tigris')}
library(tigris)
if (!require(leaflet)) {install.packages('leaflet')}
library(leaflet)
if (!require(ggthemes)) {install.packages('ggthemes')}
library(ggthemes)
if (!require(RColorBrewer)) {install.packages('RColorBrewer')}
library(RColorBrewer)
if (!require(grid)) {install.packages('grid')}
library(grid)
if (!require(gridExtra)) {install.packages('gridExtra')}
library(gridExtra)
The first visual I wanted to put together map of the United States with each state I have visited shaded in. To plot this I needed latitude and longitude data for the boundaries of each state. I chose to use the Tigris package for this data.
state.sp <- states(cb = TRUE)
In order to track which states I visited to I also needed a document listing each of the states with a column indicating whether or not I had visited.
state_list_blank<- state.sp@data %>%
select(NAME, STUSPS, GEOID) %>%
distinct() %>%
mutate(Visited = 0) %>%
#Remove US Territories
filter(GEOID < 60) %>%
arrange((GEOID))
The Tigris package provides the latitude/longitude data of the US states as a SpatialPolygonsDataFrame. The initial state map visual will use ggplot. Unfortunately, ggplot cannot process a SpatialPolygonsDataFrame. Therefore, the SpatialPolygonsDataFrame must be converted into a standard dataframe.
Some additional data cleaning is also done below. US Territories as well as Alaska and Hawaii were removed. Including Alaska and Hawaii would affect the layout and readability of the plot. In the future I hope to find US state data which would plot Hawaii and Alaska in a form similar to an inset. This would allow Hawaii and Alaska to be included while maintaining the legibility of the visual. The Socviz package has this information a county level. Unfortunately I was unable to figure out how to just map states with is information.
#State
state.sp@data$id <- rownames(state.sp@data)
state_points <- tidy(state.sp)
state.df <- state_points %>%
left_join(y = state.sp@data, by = "id") %>%
filter(!(STATEFP %in% c("02", "15", "60", "66", "69", "72", "78"))) %>%
filter(long < 1)
Now that we have the state latitude and longitude data in dataframe form I filled out the document listing which states I visited. I did this manually by exporting and then importing the document as a csv.
#Blank State List Export
# state_list_blank %>%
# write.csv("./data/list_of_states_blank.csv")
#Active State List Import
state_list_active <- read.csv("./data/list_of_states_active.csv",
stringsAsFactors = FALSE)
Now that the latitude and longitude information is in dataframe form the completed list of visited US states has been imported, also in dataframe form, I joined these files to create the final dataframe for use with ggplot.
state.df_visited <-left_join(state.df, state_list_active, by = c("STUSPS"))
Before plotting the map I defined my colors. I plan on generating several different plots so I would like to define a default color scheme I can easily call to, rather than repeating the call in each plot’s code.
cols <- c("#F8F9F9", "#2471A3")
cols <- c("#F4F6F6", "#1F618D")
US_state <- ggplot(data = state.df_visited,
mapping = aes(x = long, y = lat,
fill = factor(Visited), group = group))+
geom_polygon(color = "gray80", size = 0.05) +
scale_fill_manual(values = cols) +
theme_map() +
guides(fill=FALSE)+
coord_map(projection = "albers", lat0 = 39, lat1 = 45) +
labs(title = "United States")
US_state

Voila!
Above is a map of every state in the Unites States I have visited, sort of. Technically I have been to Illinois when I transferred flights at O’Hare, but I’m not counting that.
Yet something feels off about this map. Yes, I have been to all those states, but crossing a state line is not the same as exploring a state. Therefore, instead of just plotting every state that I have crossed into, I think it would be interesting to plot every county I have been in.
To create a map of counties visited I more or less repeated the same steps performed to generate the state visual, only this time using county level data. Again, I turned to the Tigris package.
#US County Longitude/Latitiude Informaiton from socviz Package
county.sp <- counties(cb = TRUE)
In order to track which counties I have been to I also needed document listing each of the counties with a column indicating whether or not I had visited. I generated that list from Tigris as well, using it’s database of FIPS codes. FIPS codes are Federal Information Processing Standard Publication 6-4, a set of 5 digits which uniquely identifies each county or county-like equivalent in the United States. As FIPS codes are unique to each county they will act as good tracking identifier.
county_list_blank <- fips_codes
In order to create an active list of counties I’ve visited I exported the blank list of counties, with an additional column used to indicate whether I’ve visited or not.
I filled that list out manually and imported it back into this analysis.
#Export Blank List of Counites
# county_list_blank_export <- county_list_blank %>%
# mutate(Visited = 0) %>%
# write.csv("./data/list_of_counties_blank.csv")
county_list_active_import <- read.csv("./data/list_of_counties_active.csv",
stringsAsFactors = FALSE)
Like with the state latitude and longitude data, ggplot does not accept SpatialPolygonsDataFrame files. Therefore, I converted the file to a standard dataframe.
county.sp@data$id <- rownames(county.sp@data)
county_points <- tidy(county.sp, region = "id")
county.df <- county_points %>%
left_join(y = county.sp@data, by = "id") %>%
filter(!(STATEFP %in% c("02", "15", "60", "66", "69", "72", "78"))) %>%
filter(long < 1)
Now that the spatial information is in dataframe form and I have a completed Counties Visited document, also in dataframe form, I joined these files to create the final dataframe which to be used with ggplot.
#Prepare List of Blank Counties
county_list_blank_join <- county_list_blank %>%
mutate(state_code_num = as.numeric(state_code)) %>%
mutate(county_code_num = as.numeric(county_code)) %>%
select(state_code, county_code, state_code_num, county_code_num)
#Prepare Active List of Counties
county_list_active <- county_list_active_import %>%
left_join(y = county_list_blank_join,
by = c("state_code" = "state_code_num",
"county_code" = "county_code_num")) %>%
select(-X, -state_code, -county_code) %>%
rename("state_code" = "state_code.y", "county_code" = "county_code.y") %>%
select(state_code, county_code, everything()) %>%
mutate(FIPS = paste(state_code, county_code, sep = ""))
#Join County Latitude/Longitude Data with FIPS County Information - Dataframe
county.df_visited <- left_join(county.df, county_list_active,
by = c("GEOID" = "FIPS"))
I used the state latitude and longitude dataframe in this plot, as well as the county data. Overlaying the state boundaries data on the county data makes for a clearer map.
US_county <- ggplot(mapping = aes(x = long, y = lat, group = group)) +
geom_polygon(data = county.df_visited,
mapping = aes(fill = factor(Visited)),
color = "gray80", size = 0.05) +
scale_fill_manual(values = cols) +
geom_polygon(data = state.df,
color = "black",
fill = NA,
size = 0.1) +
theme_map() +
guides(fill=FALSE) +
coord_map(projection = "albers", lat0 = 39, lat1 = 45) +
labs(title = "United States")
US_county

Well, that certainly looks a lot different than the map of just states visited. Let’s take a look at the state level and county level maps side by side.
grid.arrange(US_state, US_county, ncol = 2)

I need to get out more.
I am also particularly curious of the New England region as well as North Carolina. I grew up in Connecticut and my family lived all over New England. Family trips covered a lot of ground in those states. I currently live in North Carolina, and have for over 8 years. How much of these areas, in which I have spent so much of my life, have I really seen.
#New England
county.df_visited_new_england <- county.df_visited %>%
filter(state %in% c("CT", "RI", "MA", "VT", "NH", "ME"))
state.df_new_england <- state.df %>%
filter(STUSPS %in% c("CT", "RI", "MA", "VT", "NH", "ME"))
New_England_county <- ggplot(data = county.df_visited_new_england,
aes(x = long, y = lat, fill = factor(Visited), group = group)) +
geom_polygon(color = "gray80", size = 0.05) +
scale_fill_manual(values = cols) +
geom_polygon(data = state.df_new_england,
color = "black",
fill = NA,
size = 0.25) +
theme_map() +
guides(fill=FALSE) +
coord_equal()+
labs(title = "New England")
New_England_county

#North Carolina
county.df_visited_nc <- county.df_visited %>%
filter(state == "NC")
state.df_nc <- state.df %>%
filter(STUSPS %in% c("NC"))
NC_county <- ggplot(data = county.df_visited_nc,
aes(x = long, y = lat, fill = factor(Visited), group = group)) +
geom_polygon(color = "gray80", size = 0.05) +
scale_fill_manual(values = cols) +
geom_polygon(data = state.df_nc,
color = "black",
fill = NA,
size = 0.25) +
theme_map() +
guides(fill=FALSE) +
coord_equal() +
labs(title = "North Carolina")
NC_county

Honestly, no surprises here. For New England it’s all of Connecticut, Rhode Island, and New Hampshire, with only Martha’s Vineyard and Nantucket missed in Massachusetts, and one county being missed in northeast Vermont. But I have barely scratched the surface of Maine. Based on where my extended family resided when I was growing up, this map pretty much sums up where we would have traveled for family visits and vacations.
For North Carolina, I have been everywhere in the western part of the state. I currently live in Charlotte, which is close-ish to the center of the state. Since I’ve always been a bigger fan of the mountains than the beach, heading west from Charlotte more than east tracks.
To try one last thing, I want to generate an interactive visual. I used Leaflet for this. I plant on mapping the same shaded polygons I previously plotted in ggplot on top of the leaflet map. For this plot we do not need to filter out US Territories or Alaska or Hawaii. A complete map of the world is generated regardless.
An additional wrinkle to using Leaflet is that unlike ggplot, leaflet can use SpatialPolygonsDataFrames. Therefore, we need to create a new dataset that incorporates the county SpatialPolygonsDataFrame along with the Counties Visited information.
#Join County Latitude/Longitude Data with FIPS County Information - Geo Spatial
county.sp_visited <- geo_join(spatial_data = county.sp,
data_frame = county_list_active,
by_sp = "GEOID", by_df = "FIPS")
Now that a final dataset is available I plotted it in leaflet. Just like in the ggplot county visual I also plot the state SpatialPolygonsDataFrame in order to better visualize the state lines.
pal <- colorNumeric(palette = "viridis", domain = county.sp_visited$Visited)
leaflet(options = leafletOptions(minZoom = 3)) %>%
addTiles() %>%
addPolygons(data = state.sp,
color = "#2E4053",
weight = 1,
opacity = 1,
dashArray = 1.5,
fillColor = "white",
fillOpacity = 0
) %>%
addPolygons(data = county.sp_visited,
highlight = highlightOptions(color = "white"),
fillColor = ~pal(county.sp_visited$Visited),
weight = 0.5,
opacity = 1,
color = "white",
dashArray = 2.5,
fillOpacity = 0.25,
label = ~paste(NAME)
) %>%
setView(lng = -96,
lat = 37.8,
zoom = 4) %>%
setMaxBounds(lng1 = -180,
lng2 = -60,
lat1 = 73,
lat2 = 15)